REMARKS

Claims 1-17 are pending in the application. Claims 1-3, 7, 14 and 15 are rejected. 
Claims 4-6, 8-13, 16 and 17 are allowable. All rejections and objections are 
respectfully traversed. 

The invention summarizes a compressed video. Audio peaks are detected in an 
audio signal of the video. Motion activity in the video is quantized as a continuous 
stream of pulses and the audio peaks are correlated with the stream of quantized 
pulses to identify uninteresting events and interesting events in the video to 
summarize the video. 

Claims 1 and 14 are rejected under 35 U.S.C. 102(e) as being anticipated by
Divakaran et al. (6,763,069 - "Divakaran").

Divakaran operates on audio labels in the digital domain. The invention operates
on audio signals in the analog domain. Divakaran cannot anticipate the invention. 
Processing analog signals is unrelated to processing digital signals. 

Divakaran extracts high-level features from a video including a sequence of 
frames. First, low-level features are extracted from each frame of the video. Each 
frame of the video is then labeled digitally according to the extracted low-level 
features to generate sequences of labels. Each sequence of labels is associated with 
one of the extracted low-level features. The sequences of labels are analyzed using 
machine learning techniques to extract high-level features of the video.

Divakaran extracts low-level features from a video, such as color features, motion
features, and audio features. Frames of the video are labeled according to the 
extracted features. 

What is claimed is detecting audio peaks in an analog audio signal of the video;
see the 44 kHz audio signal and 1 kHz volume contour in Figure 1. A person of
ordinary skill in the art would readily understand that extracting audio features
for digital labels, such as 'loud', is a different operation than detecting analog
audio peaks. Item 203 of Divakaran shows digital labels (Quiet, Noisy, and Loud)
for a sequence of audio frames. Those digital labels do not indicate analog peaks
in the audio signal as claimed. It appears that the Examiner is confusing "Loud" labels
with peaks in analog audio signal magnitude. Divakaran never detects audio peaks 
in an audio signal as claimed. 

The invention quantizes motion activity in the video as a continuous stream of 
pulses. The Examiner's entire rejection of this element is "(202)," without any 
further explanation. 

MPEP 2131 explicitly states that in order to anticipate a claim, each and every 
element as set forth in the claims must be found in the prior art reference. The 
identical invention must be shown in as complete detail as is contained in the 
claim. Item 202 in Figure 2 of Divakaran only shows label values based on motion. 
The Examiner is requested to specifically point out which words in Divakaran 
mean quantize, quantize motion activity, pulse, or continuous stream of pulses. In 
fact, none of these words appear in Divakaran. The Examiner's rejection ignores 
explicit limitations recited in the claims.

So far, Divakaran fails to teach, suggest, describe or show detecting audio peaks, or
quantizing motion activity as a continuous stream of pulses. The invention 
correlates the audio peaks with the stream of quantized pulses to identify 
uninteresting events and interesting events in the video to summarize the video. 



The Examiner points to Figure 1 and col. 5, lines 31-57 of Divakaran, without
further explanation:

FIG. 2 shows a sequence of frames (1-N) 101, and three labels sequences 201,
202, and 203. The label values (Red, Green, and Blue) of the sequence 201 are
based on color features, the label values (Slow, Medium, and Fast) of the
sequence 202 are based on motion features, and the label values (Noisy, Loud)
of the sequence 203 are audio features. Note that in this example, the
boundaries of clusters of labels are not always time aligned. The manner in
which the labeling coincides or transitions can be indicative of different
semantic meanings. For example, when there is a long pan, there might be an
apparent scene change during the panning so that the color changes but motion
does not. Also when an object in the scene changes motion suddenly, there may
be motion change without color change. Similarly, the audio labels can remain
constant while the color labels change. For example, in a football video, slow
motion followed by fast motion on a green field, followed by a pan of a flesh
colored scene accompanied by loud noise can be classified as a "scoring" event.

Note, our clustering according to sequences of labels is quite different than
the prior art segmentation of a video into shots. Our clusters are according
different labels, the boundaries of clusters with different labels may not be
time aligned. This is not case in traditional video segmentation. We analyze
not only label boundaries per se, but also the time aligned relationship among
the various labels, and the transitional relations of the labels.



The Examiner is requested to specify exactly which words above mean correlating
audio peaks with the stream of quantized pulses to identify uninteresting events 
and interesting events in the video to summarize the video. Divakaran clusters 
frames according to sequences of labels. The invention correlates audio peaks with 
quantized pulses of motion activity.

It would be readily apparent to a person of ordinary skill in the art that Divakaran
fails to describe any of the elements of what is claimed. Divakaran can never 
anticipate what is claimed. Therefore, the applicants respectfully request the 
Examiner reconsider and withdraw his rejection of claims 1 and 14 based on 
Divakaran. 

Claims 2, 7 and 15 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Divakaran et al. (6,763,069). 

In claims 2 and 15, frames of the video associated with the uninteresting events
are discarded, and frames of the video associated with the interesting events are
concatenated to form a summary of the video. Divakaran clusters frames according to
sequences of labels associated with low-level features, see col. 5, lines 50-57. The 
invention associates frames of the video with interesting or uninteresting events 
based on audio peaks correlated with a stream of quantized pulses of motion 
activity. There is absolutely nothing in Divakaran that suggests what is claimed. 
Divakaran cannot make the invention obvious.

Regarding claim 7, the Examiner takes official notice that measuring an average of
motion vectors of P-frames to extract motion activity is well known in the art.
However, the Examiner has failed to produce any prior art reference that describes
the novel correlating of audio peaks with the stream of quantized pulses to identify
uninteresting events and interesting events in the video to summarize the video as
claimed. The Examiner's official notice regarding motion vector extraction lacks any
context related to the invention, and is irrelevant to what is claimed.

Claim 3 is rejected under 35 U.S.C. 103(a) as being unpatentable over Divakaran
et al. (6,763,069) in view of Hinderks (6,339,756 B1).

Hinderks fails to cure the defects of Divakaran. Hinderks describes a
programmable CODEC for compression and decompression of audio signals. The 
Examiner's use of Hinderks as a reference appears to be the result of a keyword 
search only. Hinderks' CODEC has nothing to do with either the invention or 
Divakaran. Hinderks describes processes for compressing an audio signal. In 
particular, the Examiner points to col. 23, lines 11-19, which describe a process for 
allocating bits to a quantizer to reduce noise in the compressed audio signal; see
Figure 28. The cited section is so far removed from anything concerning the
invention that the applicants have no idea what point the Examiner is trying to make.
The Examiner is requested to explain exactly why a quantizer bit allocation 
process for reducing noise in an audio signal CODEC has anything to do with 
correlating detected audio peaks with the stream of quantized pulses to identify 
uninteresting events and interesting events in the video to summarize the video as
claimed.

All rejections have been complied with, and applicant respectfully submits that the
application is now in condition for allowance. The applicant urges the Examiner to 
contact the applicant's attorney at the phone and address indicated below if 
assistance is required to move the present application to allowance.

Please charge any shortages in fees in connection with this filing to Deposit
Account 50-0749. 



Mitsubishi Electric Research Laboratories, Inc. 
201 Broadway, 8th Floor
Cambridge, MA 02139 
Telephone: (617) 621-7573 
Facsimile: (617) 621-7550 



Respectfully Submitted, 




Andrew J. Curtin
Registration No. 48,485 